
Scaling & Performance in Backend Systems (Part 3)

Statelessness (Key to Horizontal Scaling)

  • Statelessness = no server instance holds data that only it can access.

  • In horizontal scaling:

    • Multiple servers run the same code.
    • Any request can go to any server.
  • Requirement:

    • All servers must behave identically for any request.

Why Statelessness is Important

  • If one server stores unique data:

    • Other servers cannot access it

    • Leads to:

      • Errors
      • Inconsistent behavior
  • Example:

    • Server A has user session
    • Request goes to Server B → session missing → ❌ failure

Rule of Stateless Systems

  • Never store state inside a server instance
  • Always store state in shared external systems

Common Stateless Design Patterns

1. Session Management

  • ❌ Wrong:

    • Store session in server memory
  • ✅ Correct:

    • Store session in shared storage:

      • Redis (in-memory data store)

2. File Storage

  • ❌ Wrong:

    • Save files on server disk
  • ✅ Correct:

    • Use shared object storage:

      • S3 / Cloud storage

3. Database

  • ❌ Wrong:

    • Local DB (SQLite on server)
  • ✅ Correct:

    • Centralized DB (Postgres, MySQL)

Load Balancer (Core Component)

Purpose

  • Distributes incoming requests across servers

How It Works

  • Client → Load Balancer → Server → Response → Load Balancer → Client

  • Load balancer decides:

    • Which server handles each request

Load Balancing Algorithms

1. Round Robin

  • Requests distributed sequentially:

    • A → B → C → A → B → C

Best when:

  • Requests are similar
  • Servers have equal capacity

Problem with Round Robin

  • Cannot differentiate:

    • Light vs heavy requests
  • May overload one server
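Round robin is simple enough to sketch in a few lines; the server names are placeholders:

```python
from itertools import cycle

# Round robin: hand out servers in a fixed repeating order.
servers = ["A", "B", "C"]
next_server = cycle(servers)

assignments = [next(next_server) for _ in range(6)]
# Each server gets every third request, regardless of how
# heavy any individual request is — which is exactly the weakness.
```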


2. Weighted Round Robin

  • Servers get traffic based on capacity

Example:

  • Server A (2x capacity) → gets 2x requests
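A minimal sketch of the weighting idea (the weights here are illustrative): a server's weight controls how many slots it occupies in the rotation.

```python
# Weighted round robin: server A has weight 2, so it appears
# twice per rotation and receives twice the traffic of B or C.
weights = {"A": 2, "B": 1, "C": 1}

# Build one rotation containing each server `weight` times.
rotation = [s for s, w in weights.items() for _ in range(w)]  # A, A, B, C

def pick(request_number: int) -> str:
    return rotation[request_number % len(rotation)]

picks = [pick(i) for i in range(8)]
```

Real load balancers interleave weighted picks more smoothly, but the traffic proportions come out the same.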

3. Least Connections

  • Sends request to server with:

    • Fewest active connections

Better for:

  • Mixed workloads (light + heavy requests)
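A sketch of the least-connections decision, assuming the balancer tracks in-flight requests per server:

```python
# Least connections: route each request to the server with the
# fewest active connections; update counts as requests open/close.
active = {"A": 0, "B": 0, "C": 0}

def route() -> str:
    server = min(active, key=active.get)  # fewest active connections
    active[server] += 1
    return server

def finish(server: str) -> None:
    active[server] -= 1

# A long request ties up server A while others absorb new traffic.
first = route()    # all idle -> "A" (first minimum)
second = route()   # A is busy -> "B"
finish(second)     # B's request completes quickly
third = route()    # A still busy -> goes to B again, never A
```

Unlike round robin, a server stuck on a heavy request naturally stops receiving new traffic.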

4. Other Algorithms

  • Least response time
  • Resource-based (CPU/RAM usage)

Handling Server Failures

Problem

  • Load balancer may still send traffic to dead server

Solution: Health Checks

  • Load balancer sends periodic test requests

  • If server fails:

    • Marked unhealthy
    • Removed from routing
  • When server recovers:

    • Added back automatically
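The health-check loop can be sketched like this. `probe` is a hypothetical check function; a real balancer would send an HTTP GET to something like a `/health` endpoint on a timer.

```python
# Health checks: only servers that pass the probe stay in the pool.
def update_pool(servers: list[str], probe) -> list[str]:
    """Return the subset of servers that pass the health check."""
    return [s for s in servers if probe(s)]

# Simulated probe results: server B is down.
status = {"A": True, "B": False, "C": True}
pool = update_pool(["A", "B", "C"], lambda s: status[s])
# B is removed from routing; traffic flows only to A and C.

status["B"] = True  # B recovers...
recovered = update_pool(["A", "B", "C"], lambda s: status[s])
# ...and the next probe cycle adds it back automatically.
```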

Database Scaling Challenge

  • Backend scaling is easy when servers are stateless

  • Database is:

    • Stateful
    • Harder to scale

Read Replicas

Concept

  • One primary DB (handles writes)
  • Multiple replicas (handle reads)

Benefits

  • Reduces load on primary DB
  • Improves latency (geo-distribution)

Request Distribution

  • ~70–90% reads → replicas
  • Writes → primary
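The read/write split above is often implemented as a small router in front of the connection pool. A sketch, with connection names as stand-ins for real connections:

```python
import random

# Read/write splitting: writes go to the primary, reads are
# spread across the replicas.
PRIMARY = "primary-db"
REPLICAS = ["replica-1", "replica-2", "replica-3"]

def pick_connection(sql: str) -> str:
    """Route by statement type: SELECTs -> a replica, everything else -> primary."""
    is_read = sql.lstrip().upper().startswith("SELECT")
    return random.choice(REPLICAS) if is_read else PRIMARY
```

Prefix-matching on SQL is a simplification; real routers also handle transactions and `SELECT ... FOR UPDATE`, which must go to the primary.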

Problem: Replication Lag

  • Data replication takes time (e.g., 200 ms)

Issue Example

  1. User updates name → primary DB
  2. Immediately fetches data → replica
  3. Replica not updated yet → stale data ❌

Solutions to Replication Lag

  • Route reads to primary after write
  • Delay read requests
  • Track replication lag
  • Frontend delay (controlled fetch timing)
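The first two mitigations combine into a common "read your own writes" pattern: after a user writes, pin that user's reads to the primary until an assumed lag window has passed. The 1-second window below is an illustrative assumption, not a universal value.

```python
import time

LAG_WINDOW = 1.0  # seconds; assumed upper bound on replication lag
last_write_at: dict[str, float] = {}

def record_write(user_id: str) -> None:
    """Call this whenever the user writes to the primary."""
    last_write_at[user_id] = time.monotonic()

def read_target(user_id: str) -> str:
    """Primary for users who just wrote; replicas for everyone else."""
    elapsed = time.monotonic() - last_write_at.get(user_id, float("-inf"))
    return "primary" if elapsed < LAG_WINDOW else "replica"
```

This keeps the user who just updated their name from seeing stale data, while other users' reads still offload to replicas.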

Sharding (Partitioning)

Concept

  • Split large table into multiple DB instances

Example

  • Orders table split by:

    • Date (Jan–Jun, Jul–Dec)

Benefits

  • Smaller datasets → faster queries
  • Multiple DB instances → higher throughput

Key Challenge

  • Choosing shard key

    • Example:

      • Date
      • User ID
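A hash-based shard key can be sketched like this (shard count and key choice are illustrative). Hashing a user ID spreads rows evenly; a range key like date instead keeps related rows together, which suits time-based queries.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: str) -> int:
    """Deterministically map a user ID onto one of NUM_SHARDS shards."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

The same key always lands on the same shard, so both writes and reads for a user hit one DB instance.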

Distributed Databases (Modern Trend)

Examples

  • PlanetScale
  • Neon
  • CockroachDB
  • Yugabyte

Benefits

  • Handle:

    • Replication
    • Sharding
    • Scaling
  • Managed by provider


Practical Advice

  • Don’t build your own DB infra early

  • Use managed services:

    • AWS RDS
    • GCP SQL
    • Neon

CDN (Content Delivery Network)

Purpose

  • Reduce latency caused by:

    • Physical distance

Physics Limitation

  • Speed of light in fiber → a hard floor of roughly ~100 ms round trip over long intercontinental distances
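A back-of-the-envelope check on that floor, assuming light in fiber travels at roughly 200,000 km/s (about two-thirds of c) and an illustrative 10,000 km one-way path:

```python
# Rough physics floor on request latency over distance.
speed_in_fiber_km_s = 200_000   # ~2/3 the speed of light in vacuum
one_way_km = 10_000             # illustrative intercontinental distance

round_trip_ms = (2 * one_way_km / speed_in_fiber_km_s) * 1000
# ~100 ms before any server processing happens at all
```

No amount of server optimization removes this; only moving the content closer does.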

CDN Solution

  • Place servers (edge nodes) near users

Benefits

1. Reduced Latency

  • From:

    • ~100 ms → ~2–3 ms

2. Reduced Server Load

  • CDN serves cached content
  • Origin server gets fewer requests

What to Cache in CDN

1. Static Content

  • JS, CSS, HTML
  • Images, videos, fonts

2. API Responses

  • Example:

    • Product catalog
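Making an API response CDN-cacheable usually comes down to standard HTTP caching headers. A sketch (TTL values and the helper name are illustrative; framework wiring is omitted): `s-maxage` governs shared caches like the CDN, `max-age` governs the browser.

```python
# Headers that let a CDN cache a slow-changing API response
# (e.g. a product catalog) while browsers keep a shorter copy.
def catalog_cache_headers(cdn_ttl_s: int = 300, browser_ttl_s: int = 60) -> dict:
    return {
        "Cache-Control": f"public, max-age={browser_ttl_s}, s-maxage={cdn_ttl_s}",
    }
```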

CDN Cache Invalidation

  • Purge cache using:

    • Tags
    • Events

CDN for Security

DDoS Protection

  • CDN absorbs malicious traffic

  • Prevents:

    • Server crash
    • Cost explosion

Edge Computing

  • CDN nodes = edge of network
  • Located near ISPs
  • Serve content instantly

Final Takeaways

  • Statelessness enables horizontal scaling

  • Load balancers distribute traffic intelligently

  • Health checks keep traffic off failed servers

  • Databases scale via:

    • Replication
    • Sharding
  • CDNs solve:

    • Latency
    • Load
    • Security